BINNED_STATISTIC_2D
Overview
The BINNED_STATISTIC_2D function computes a bidimensional binned statistic for data points distributed across a two-dimensional space. This function is a generalization of a 2D histogram: while a histogram counts the number of points falling into each bin, BINNED_STATISTIC_2D allows computation of various statistics—including mean, sum, median, count, standard deviation, min, and max—for the values associated with points within each bin.
This implementation wraps the scipy.stats.binned_statistic_2d function from the SciPy scientific computing library. For related functionality, see also numpy.histogram2d for standard 2D histograms and scipy.stats.binned_statistic for the one-dimensional version.
The function divides the x-y plane into a grid of rectangular bins based on the specified number of bins and optional range limits. Each data point (x_i, y_i) is assigned to a bin, and the corresponding value v_i contributes to the statistic calculated for that bin. For example, when using the mean statistic, the function computes:
\bar{v}_{\text{bin}} = \frac{1}{n_{\text{bin}}} \sum_{i \in \text{bin}} v_i
where n_{\text{bin}} is the number of points in the bin. Empty bins are represented as blank cells in the output.
The bin edges follow a half-open interval convention: for edges [e₁, e₂, e₃], the first bin includes values in [e₁, e₂) and the second bin includes [e₂, e₃]. The rightmost bin is closed on both ends to include the maximum value. Points falling outside the specified range (when xlowerlimit, xupperlimit, ylowerlimit, and yupperlimit are provided) are excluded from the computation.
Common applications include spatial data analysis, creating heatmaps of aggregated values, analyzing geographic distributions, and summarizing sensor data across a 2D field.
This example function is provided as-is without any representation of accuracy.
Excel Usage
=BINNED_STATISTIC_2D(x, y, values, bstwod_statistic, bins, xlowerlimit, xupperlimit, ylowerlimit, yupperlimit)
x(list[list], required): Data to bin along the first dimension (x-axis).y(list[list], required): Data to bin along the second dimension (y-axis).values(list[list], required): Data on which to compute the statistic.bstwod_statistic(str, optional, default: “mean”): The statistic to compute for each bin.bins(int, optional, default: 10): Number of bins for both dimensions.xlowerlimit(float, optional, default: null): Lower bound for the x bin range.xupperlimit(float, optional, default: null): Upper bound for the x bin range.ylowerlimit(float, optional, default: null): Lower bound for the y bin range.yupperlimit(float, optional, default: null): Upper bound for the y bin range.
Returns (list[list]): 2D list of bin statistics, or error message string.
Examples
Example 1: Count statistic in 2x2 bins
Inputs:
| x | y | values | bstwod_statistic | bins | xlowerlimit | xupperlimit | ylowerlimit | yupperlimit |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 2.1 | 1 | count | 2 | 0 | 1 | 2 | 3 |
| 0.1 | 2.6 | 1 | ||||||
| 0.1 | 2.1 | 1 | ||||||
| 0.6 | 2.1 | 1 |
Excel formula:
=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {1;1;1;1}, "count", 2, 0, 1, 2, 3)
Expected output:
| Result | |
|---|---|
| 2 | 1 |
| 1 | 0 |
Example 2: Mean statistic in 2x2 bins
Inputs:
| x | y | values | bstwod_statistic | bins | xlowerlimit | xupperlimit | ylowerlimit | yupperlimit |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 2.1 | 10 | mean | 2 | 0 | 1 | 2 | 3 |
| 0.1 | 2.6 | 20 | ||||||
| 0.1 | 2.1 | 30 | ||||||
| 0.6 | 2.1 | 40 |
Excel formula:
=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {10;20;30;40}, "mean", 2, 0, 1, 2, 3)
Expected output:
| Result | |
|---|---|
| 20 | 20 |
| 40 |
Example 3: Sum statistic in 2x2 bins
Inputs:
| x | y | values | bstwod_statistic | bins | xlowerlimit | xupperlimit | ylowerlimit | yupperlimit |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 2.1 | 1 | sum | 2 | 0 | 1 | 2 | 3 |
| 0.1 | 2.6 | 2 | ||||||
| 0.1 | 2.1 | 3 | ||||||
| 0.6 | 2.1 | 4 |
Excel formula:
=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {1;2;3;4}, "sum", 2, 0, 1, 2, 3)
Expected output:
| Result | |
|---|---|
| 4 | 2 |
| 4 | 0 |
Example 4: Median statistic in 2x2 bins
Inputs:
| x | y | values | bstwod_statistic | bins | xlowerlimit | xupperlimit | ylowerlimit | yupperlimit |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 2.1 | 10 | median | 2 | 0 | 1 | 2 | 3 |
| 0.1 | 2.6 | 20 | ||||||
| 0.1 | 2.1 | 30 | ||||||
| 0.6 | 2.1 | 40 |
Excel formula:
=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {10;20;30;40}, "median", 2, 0, 1, 2, 3)
Expected output:
| Result | |
|---|---|
| 20 | 20 |
| 40 |
Python Code
import math
from scipy.stats import binned_statistic_2d as scipy_binned_statistic_2d
def binned_statistic_2d(x, y, values, bstwod_statistic='mean', bins=10, xlowerlimit=None, xupperlimit=None, ylowerlimit=None, yupperlimit=None):
"""
Computes a bidimensional binned statistic (mean, sum, median, etc.) for the input data.
See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic_2d.html
This example function is provided as-is without any representation of accuracy.
Args:
x (list[list]): Data to bin along the first dimension (x-axis).
y (list[list]): Data to bin along the second dimension (y-axis).
values (list[list]): Data on which to compute the statistic.
bstwod_statistic (str, optional): The statistic to compute for each bin. Valid options: Mean, Sum, Median, Count, Standard Deviation, Min, Max. Default is 'mean'.
bins (int, optional): Number of bins for both dimensions. Default is 10.
xlowerlimit (float, optional): Lower bound for the x bin range. Default is None.
xupperlimit (float, optional): Upper bound for the x bin range. Default is None.
ylowerlimit (float, optional): Lower bound for the y bin range. Default is None.
yupperlimit (float, optional): Upper bound for the y bin range. Default is None.
Returns:
list[list]: 2D list of bin statistics, or error message string.
"""
# Helper function to normalize 2D list inputs
def to2d(val):
return [[val]] if not isinstance(val, list) else val
# Convert scalars to 2D lists if needed
x = to2d(x)
y = to2d(y)
values = to2d(values)
# Flatten 2D lists to 1D
try:
flat_x = [float(item) for row in x for item in (row if isinstance(row, list) else [row])]
flat_y = [float(item) for row in y for item in (row if isinstance(row, list) else [row])]
flat_values = [float(item) for row in values for item in (row if isinstance(row, list) else [row])]
except Exception:
return "Invalid input: x, y, and values must be 2D lists or scalars of numbers."
if len(flat_x) == 0 or len(flat_y) == 0 or len(flat_values) == 0:
return "Invalid input: x, y, and values must not be empty."
if len(flat_x) != len(flat_y) or len(flat_x) != len(flat_values):
return "Invalid input: x, y, and values must have the same length."
if bstwod_statistic not in ('mean', 'sum', 'median', 'count', 'std', 'min', 'max'):
return "Invalid input: bstwod_statistic must be one of 'mean', 'sum', 'median', 'count', 'std', 'min', 'max'."
try:
nbins = int(bins)
except Exception:
return "Invalid input: bins must be an integer."
if nbins < 1:
return "Invalid input: bins must be >= 1."
# Prepare range
range_arg = None
if (xlowerlimit is not None and xupperlimit is not None and
ylowerlimit is not None and yupperlimit is not None):
try:
range_arg = [(float(xlowerlimit), float(xupperlimit)), (float(ylowerlimit), float(yupperlimit))]
except Exception:
return "Invalid input: xlowerlimit, xupperlimit, ylowerlimit, yupperlimit must be numbers."
try:
if range_arg:
res = scipy_binned_statistic_2d(flat_x, flat_y, flat_values, statistic=bstwod_statistic, bins=nbins, range=range_arg)
else:
res = scipy_binned_statistic_2d(flat_x, flat_y, flat_values, statistic=bstwod_statistic, bins=nbins)
stat = res.statistic
except Exception as e:
return f"scipy.stats.binned_statistic_2d error: {e}"
# Return as 2D list (row-major), converting NaN to empty string for empty bins
result = []
for row in stat.tolist():
result_row = []
for val in row:
if val is None or (isinstance(val, float) and math.isnan(val)):
result_row.append("")
else:
result_row.append(float(val))
result.append(result_row)
return result